Abstract. In this paper we show that different physiological states and pathological
conditions may be characterized in terms of the predictability of time series signals from
the underlying biological system. In particular we consider systolic arterial pressure
time series from healthy subjects and Chronic Heart Failure patients undergoing
paced respiration. We model the time series by the regularized least squares approach
and quantify predictability by the leave-one-out error. We find that the entrainment
mechanism connected to paced breathing, which renders the arterial blood pressure signal
more regular and thus more predictable, is less effective in patients, and this effect
correlates with the seriousness of the heart failure. The leave-one-out error separates
controls from patients and, when all orders of nonlinearity are taken into account, surviving
patients from patients in whom cardiac death occurred.

1. Introduction



Physiological signals derived from humans are extraordinarily complex, as they reflect
ongoing processes involving very complicated regulation mechanisms (Glass 2001), and
can be used to diagnose incipient pathophysiological conditions. Many approaches to
the characterization and analysis of physiological signals have been introduced in recent
years, including, for example, studies of: Fourier spectra (Akselrod et al 1981, Pinna
et al 2002), chaotic dynamics (Babloyantz et al 1985, Poon and Merrill 1997), wavelet
analysis (Thurner et al 1998, Marrone et al 1999), scaling properties (Nunes Amaral et
al 1998, Ashkenazy et al 2001, Ivanov and Lo 2002), multifractal properties (Ivanov et
al 1999, Nunes Amaral et al 2001), correlation integrals (Lehnertz and Elger 1998), 1/f
spectra (Peng et al 1993, Ivanov et al 2001) and synchronization properties (Schafer
et al 1998, Tass et al 1998, Angelini et al 2004). Less attention has been paid to
the degree of determinism (Kantz and Schreiber 1997) of a physiological time series.
It is the purpose of the present work to show that different physiological states, or
pathological conditions, may be characterized in terms of the predictability of time series.
In particular we consider here the predictability of Systolic Arterial Pressure (SAP) time
series under paced respiration, and show that a suitable index separates healthy subjects
from Chronic Heart Failure (CHF) patients. SAP is the
maximal pressure within the cardiovascular system as the heart pumps blood into
the arteries. Paced respiration (breathing synchronized with some external signal)
is a well-established experimental procedure to regularize and standardize respiratory
activity during autonomic laboratory investigations (Cooke et al 1998), and a useful
tool for relaxation and for the treatment of chronic pain and insomnia, dental and
facial pain, etc. (Clark and Hirschman 1980, Clark and Hirschman 1990, Freedman and
Woodward 1992). Entrainment between heart and respiration rate (cardiorespiratory
synchronization) has been detected in subjects undergoing paced respiration (Schiek et
al 1997, Pomortsev et al 1998). Paced breathing can prevent vasovagal syncope during
head-up tilt testing (Jauregui-Renaud et al 2003); in healthy subjects under paced
respiration the synchronization between the main processes governing the cardiovascular
system is stronger than the synchronization in the case of spontaneous respiration
(Prokhorov et al 2003). However, a number of important questions remain open about
paced breathing, including the dependence on the frequency of respiration and whether
it affects the autonomic balance. In a healthy cardiorespiratory system, the regime of
paced respiration induces regularization of related physiological signals (Brown Troy et
al 1993, Pinna et al 2003); in particular, blood pressure time series become smoother and
more deterministic. To quantify this phenomenon, we face two problems:
(i) how may we model the SAP time series? (ii) what measure of predictability is the
most suitable? In the present paper we model time series by the Regularized Least Squares
(RLS) approach (Mukherjee et al 2002). The choice of this class of models is motivated
by the fact that it enjoys several interesting properties. The most important is that
such models have high generalization capacity. This means that they are able to predict
complex signals when only a small, finite number of observations of the signal itself is
available. Moreover, the degree of nonlinearity present in the modelling, introduced
by a kernel method, may be easily controlled. Finally, they allow an easy calculation
of the leave-one-out (LOO) error (Vapnik 1998), the quantity that we use to quantify
predictability. To our knowledge, this is the first time RLS models are used to model time
series; our approach generalizes the classical autoregressive (AR) approach to time series
analysis (Kantz and Schreiber 1997). It is worth mentioning that recently (Shalizi et al
2004) a measure of self-organization, rooted in optimal predictors, has been proposed.
In the same spirit, the LOO prediction error is related to the degree of organization of the
underlying physiological system.



2. Method 



2.1. Regularized least squares linear models for regression 

Let us consider a set of $\ell$ independent, identically distributed data
$S = \{(\mathbf{x}_i, y_i)\}_{i=1}^{\ell}$, where $\mathbf{x}_i$ is the $n$-dimensional
vector of input variables and $y_i$ is the scalar output variable. Data are drawn from an
unknown probability distribution $p(\mathbf{x}, y)$. The problem of learning consists in
providing an estimator $f: \mathbf{x} \to y$, out of a class of functions $F(\mathbf{w})$,
called hypothesis space, parametrized by a vector $\mathbf{w}$. Let us first
consider the class of linear functions $y = \mathbf{w} \cdot \mathbf{x}$, where $\mathbf{w}$
is the $n$-dimensional vector of parameters. To provide a bias term in the linear function
(to be included if $\mathbf{x}$ or $y$ have non-vanishing mean), a supplementary input
variable (constant and equal to one) is to be included in the input vector. In the
regularized least squares approach, $\mathbf{w}$ is chosen so as to minimize the following
functional:

$$ L(\mathbf{w}) = \sum_{i=1}^{\ell} (y_i - \mathbf{w} \cdot \mathbf{x}_i)^2 + \lambda \|\mathbf{w}\|^2, \qquad (1) $$

where $\|\mathbf{w}\| = \sqrt{\mathbf{w} \cdot \mathbf{w}}$ is the Euclidean norm induced by
the scalar product. The first term of the functional $L$ is, up to the factor $1/\ell$, the
empirical risk, i.e. the mean square prediction error evaluated on the training data; the
second one (regularization term) can be motivated geometrically by the following
considerations. Let us view the data $(\mathbf{x}_i, y_i)$ as points in
an $(n+1)$-dimensional space. Each function $y = \mathbf{w} \cdot \mathbf{x}$ determines a
hyperplane in this space, approximating the data points. The square prediction error on
point $i$ is $e_i = (y_i - \mathbf{w} \cdot \mathbf{x}_i)^2$; let $d_i$ be the square
distance between the point and the approximating hyperplane. It is easy to see that
(see fig. 1):

$$ d_i = \frac{e_i}{1 + \|\mathbf{w}\|^2}. \qquad (2) $$

This equation shows that the smaller $\|\mathbf{w}\|^2$, the better the deviation $e_i$
approximates the true distance $d_i$. Hence the role of the regularization term, which
penalizes large values of $\|\mathbf{w}\|$ and whose relevance depends on the value of the
parameter $\lambda$, is to let the linear estimator be chosen as the hyperplane minimizing
the mean square distance with the data points.
It is easy to minimize the functional $L$ and get the optimal hyperplane:

$$ \mathbf{w} = (\mathbf{A} + \lambda \mathbf{I})^{-1} \mathbf{b}, \qquad (3) $$

where $\mathbf{A}$ is the $n \times n$ matrix given by

$$ \mathbf{A} = \sum_{i=1}^{\ell} \mathbf{x}_i \mathbf{x}_i^{T}, \qquad (4) $$

$\mathbf{b}$ is the $n$-dimensional vector given by

$$ \mathbf{b} = \sum_{i=1}^{\ell} y_i \mathbf{x}_i, \qquad (5) $$

while $\mathbf{I}$ stands for the identity matrix.
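As an illustration only, the closed-form solution (3)-(5) can be sketched in a few lines of NumPy. The function name `rls_fit`, the toy data and the value of the regularization parameter are assumptions made for this example; they are not taken from the study.

```python
import numpy as np

def rls_fit(X, y, lam):
    """Solve w = (A + lam*I)^{-1} b, with A and b as in equations (4) and (5).

    X has shape (n, l): one input vector per column, following the text.
    """
    n = X.shape[0]
    A = X @ X.T                      # equation (4): sum_i x_i x_i^T
    b = X @ y                        # equation (5): sum_i y_i x_i
    return np.linalg.solve(A + lam * np.eye(n), b)   # equation (3)

# Hypothetical toy data: a noisy linear target; the last input is the
# constant (equal to one) that provides the bias term.
rng = np.random.default_rng(0)
l = 200
X = np.vstack([rng.normal(size=(2, l)), np.ones((1, l))])
w_true = np.array([1.5, -0.7, 0.3])
y = w_true @ X + 0.01 * rng.normal(size=l)

lam = 1e-3
w = rls_fit(X, y, lam)

# Stationarity of the functional (1): X (y - X^T w) - lam * w should vanish.
gradient_residual = X @ (y - w @ X) - lam * w
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly, which is usually preferable numerically to computing the inverse matrix and multiplying.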

The empirical risk $E_{emp} = \frac{1}{\ell} \sum_{i=1}^{\ell} e_i$ is not a good measure of
the quality of the estimator. What matters is the generalization ability, i.e. the
prediction error on data points which have not been used to train the estimator. The
following measure of the generalization performance, known as the LOO procedure, is both
intuitive and statistically robust (one can show that the LOO error is almost unbiased, see
Luntz and Brailovsky 1969). For each $i$, data point $i$ is removed from the data set. The
approximating hyperplane is then determined on the basis of the residual set of $\ell - 1$
points; the square prediction error by this hyperplane on point $i$ will be denoted
$e_i^{loo}$. The LOO error is then defined as follows:
$E_{loo} = \frac{1}{\ell} \sum_{i=1}^{\ell} e_i^{loo}$. In principle, the calculation of
$E_{loo}$ requires the estimation of $\ell$ hyperplanes, thus rendering this procedure
unfeasible, or at least impractical. However, the class of models we are considering here
allows calculating the LOO error after inversion of only one $n \times n$ matrix. It can be
shown (Mukherjee et al 2002) that:

$$ E_{loo} = \frac{1}{\ell} \sum_{i=1}^{\ell} \left( \frac{y_i - \mathbf{w} \cdot \mathbf{x}_i}{1 - G_{ii}} \right)^2, \qquad (6) $$

where $\mathbf{w}$ is trained on the full data set, using equation (3), and $\mathbf{G}$ is
an $\ell \times \ell$ matrix given by

$$ \mathbf{G} = \mathbf{X}^{T} (\mathbf{A} + \lambda \mathbf{I})^{-1} \mathbf{X}; \qquad (7) $$

here we denote by $\mathbf{X}$ the $n \times \ell$ matrix whose columns are the input data
$\{\mathbf{x}_i\}$.
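The shortcut can be checked numerically against the definition of the LOO procedure. The sketch below, on hypothetical random data, compares the single-inversion formula with the explicit loop that retrains the model $\ell$ times; all names (`loo_closed_form`, `loo_brute_force`) and parameter values are assumptions for illustration.

```python
import numpy as np

def rls_fit(X, y, lam):
    """Regularized least squares solution; columns of X are input vectors."""
    n = X.shape[0]
    return np.linalg.solve(X @ X.T + lam * np.eye(n), X @ y)

def loo_closed_form(X, y, lam):
    """LOO error from one n x n system, using G = X^T (A + lam*I)^{-1} X."""
    n = X.shape[0]
    w = rls_fit(X, y, lam)
    G = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), X)
    loo_residuals = (y - w @ X) / (1.0 - np.diag(G))
    return np.mean(loo_residuals ** 2)

def loo_brute_force(X, y, lam):
    """LOO error by definition: retrain with point i removed, test on point i."""
    l = X.shape[1]
    errors = []
    for i in range(l):
        keep = np.arange(l) != i
        w_i = rls_fit(X[:, keep], y[keep], lam)
        errors.append((y[i] - w_i @ X[:, i]) ** 2)
    return np.mean(errors)

# Hypothetical data, used only to compare the two computations.
rng = np.random.default_rng(1)
X = rng.normal(size=(3, 30))
y = rng.normal(size=30)

fast = loo_closed_form(X, y, lam=0.5)
slow = loo_brute_force(X, y, lam=0.5)
```

The two values agree to numerical precision, while the closed form needs one linear solve instead of $\ell$ of them.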

Figure 1. Geometrical interpretation of regularization

The value of the parameter $\lambda$ is to be tuned to minimize the LOO error. In other
words, this free parameter is to be tuned to enhance the generalization capability of the
model. It is useful, for the nonlinear extension of these models, to express $\mathbf{w}$
as a linear combination of the vectors $\mathbf{x}_i$ for $i = 1, 2, \ldots, \ell$. Indeed,
if $\ell > n$ one can suppose that the vectors $\{\mathbf{x}_i\}$ span all the
$n$-dimensional space, constituting an over-complete system of vectors. This means that
there exist $\ell$ coefficients $\mathbf{c} = (c_1, c_2, \ldots, c_\ell)^{T}$ such that:

$$ \mathbf{w} = \mathbf{X} \mathbf{c}. \qquad (8) $$

Simple calculations yield

$$ \mathbf{c} = (\mathbf{K} + \lambda \mathbf{I})^{-1} \mathbf{y}, \qquad (9) $$

where $\mathbf{K} = \mathbf{X}^{T} \mathbf{X}$ is an $\ell \times \ell$ matrix with generic
element $K_{ij} = \mathbf{x}_i \cdot \mathbf{x}_j$, whereas
$\mathbf{y} = (y_1, y_2, \ldots, y_\ell)^{T}$ is a vector formed by the $\ell$ values of the
output variable. The prediction $y$, in correspondence to an input vector $\mathbf{x}$, may
then be written as a sum over input data:

$$ f: \mathbf{x} \to y = \sum_{i=1}^{\ell} c_i \, \mathbf{x}_i \cdot \mathbf{x}. \qquad (10) $$

Equations (9) and (10) show that the evaluation of the linear predictor, as well as the
computation of the parameter vector $\mathbf{c}$, involve only scalar products of data in
the input space. This property allows us to extend the regularized linear models to the
nonlinear case, as we describe in the next subsection.
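A short numerical check may help to see that the two parametrizations describe the same predictor. The sketch below, on hypothetical random data, computes $\mathbf{w}$ from equation (3) and from equations (8)-(9), and compares the predictions on a new input; the data and the value of $\lambda$ are assumptions for illustration.

```python
import numpy as np

# Hypothetical data: n = 4 input variables, l = 50 examples (columns of X).
rng = np.random.default_rng(1)
n, l, lam = 4, 50, 0.1
X = rng.normal(size=(n, l))
y = rng.normal(size=l)

# Primal solution, equation (3): w = (X X^T + lam*I)^{-1} X y.
w_primal = np.linalg.solve(X @ X.T + lam * np.eye(n), X @ y)

# Dual solution, equations (8)-(9): c = (K + lam*I)^{-1} y and w = X c.
K = X.T @ X                                   # K_ij = x_i . x_j
c = np.linalg.solve(K + lam * np.eye(l), y)
w_dual = X @ c

# Prediction on a new input, directly and via equation (10).
x_new = rng.normal(size=n)
pred_primal = w_primal @ x_new
pred_dual = c @ (X.T @ x_new)                 # sum_i c_i (x_i . x_new)
```

The dual form requires an $\ell \times \ell$ system instead of an $n \times n$ one, but it touches the inputs only through scalar products, which is what the nonlinear extension relies on.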

2.2. Nonlinear models 

The extension to the general case of nonlinear predictors is done by mapping the input
vectors $\mathbf{x}$ into a higher dimensional space $\mathcal{H}$, called feature space,
and looking for a linear predictor in this new space. Let
$\Phi(\mathbf{x}) \in \mathcal{H}$ be the image of the point $\mathbf{x}$ in the feature
space, with:

$$ \Phi(\mathbf{x}) = (\phi_1(\mathbf{x}), \phi_2(\mathbf{x}), \ldots, \phi_N(\mathbf{x}), \ldots) $$

where $\{\phi_i\}$ are real functions. Note that the number of components of the feature
space can be finite, countable or even uncountably infinite. Moreover, suppose that one of
the features is constant. This hypothesis allows us to write the linear predictor in the
feature space $\mathcal{H}$ without making the bias term explicit. In the feature space
induced by the mapping $\Phi$, a linear predictor takes the form:

$$ y = f(\mathbf{x}) = \mathbf{w} \cdot \Phi(\mathbf{x}) \qquad (11) $$

where now $\mathbf{w}$, according to the nature of the feature space, may have a finite or
infinite number of components. Again, we hypothesize that $\mathbf{w}$ may be written as a
linear combination of the vectors $\Phi(\mathbf{x}_i)$ with $i = 1, 2, \ldots, \ell$ (if
this hypothesis is not met, we thus determine a solution constrained to the subspace of the
feature space spanned by the vectors $\{\Phi(\mathbf{x}_i)\}_{i=1,\ldots,\ell}$). This means
that there exist $\ell$ coefficients $(c_1, c_2, \ldots, c_\ell)^{T}$ such that:

$$ \mathbf{w} = \sum_{i=1}^{\ell} c_i \Phi(\mathbf{x}_i). \qquad (12) $$

Under this hypothesis, the linear predictor in the feature space $\mathcal{H}$ takes the
form:

$$ y = f(\mathbf{x}) = \sum_{i=1}^{\ell} c_i \, \Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}), \qquad (13) $$

and, therefore, will be nonlinear in the original input variables. The vector $\mathbf{c}$
is given by (9), with $\mathbf{K}$ the $\ell \times \ell$ matrix with generic element
$K_{ij} = \Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j)$. Note that
the evaluation of the predictor on new data points and the definition of the matrix
$\mathbf{K}$ involve the computation of scalar products between vectors in the feature
space, which can be computationally prohibitive if the number of features is very large. A
possible solution to these problems consists in making the following choice:

$$ \Phi(\mathbf{x}) = (\sqrt{\alpha_1}\,\psi_1(\mathbf{x}), \sqrt{\alpha_2}\,\psi_2(\mathbf{x}), \ldots, \sqrt{\alpha_N}\,\psi_N(\mathbf{x}), \ldots) $$

where $\alpha_\gamma$ and $\psi_\gamma$ are the eigenvalues and eigenfunctions of an
integral operator whose kernel $K(\mathbf{x}, \mathbf{y})$ is a positive definite symmetric
function. With this choice, the scalar product in the feature space becomes particularly
simple because

$$ \Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j) = \sum_{\gamma} \alpha_\gamma \, \psi_\gamma(\mathbf{x}_i) \, \psi_\gamma(\mathbf{x}_j) = K(\mathbf{x}_i, \mathbf{x}_j), \qquad (14) $$

where the last equality comes from the Mercer-Hilbert-Schmidt theorem for positive
definite functions (Riesz and Nagy 1955). The predictor has, in this case, the form:

$$ y = f(\mathbf{x}) = \sum_{i=1}^{\ell} c_i \, K(\mathbf{x}_i, \mathbf{x}). \qquad (15) $$

Analogously, the LOO error can be calculated as follows:

$$ E_{loo} = \frac{1}{\ell} \sum_{i=1}^{\ell} \left( \frac{y_i - f(\mathbf{x}_i)}{1 - G_{ii}} \right)^2, \qquad (16) $$

where the matrix $\mathbf{G}$ can be shown to be equal to
$\mathbf{K} (\mathbf{K} + \lambda \mathbf{I})^{-1}$. Many choices of the
kernel function are possible; for example, the polynomial kernel of degree $p$ has the form
$K(\mathbf{x}, \mathbf{y}) = (1 + \mathbf{x} \cdot \mathbf{y})^p$ (the corresponding
features are made of all the powers of $\mathbf{x}$ up to the $p$-th). The RBF Gaussian
kernel is $K(\mathbf{x}, \mathbf{y}) = \exp(-\|\mathbf{x} - \mathbf{y}\|^2 / 2\sigma^2)$ and
deals with all the degrees of nonlinearity of $\mathbf{x}$. By specifying the kernel
function $K$ one determines the complexity of the function space within which we search for
the predictor. This is similar to the effect of specifying the architecture of a neural
network (number of layers, number of units for each layer, type of activation functions),
which defines the set of functions that the neural network implements. Notice that,
depending on the kernel function, we can have a countable or even an uncountable number of
features. The latter case applies, for example, to the Gaussian function. The use of kernel
functions to implicitly perform projections, the kernel trick, is at the basis of Support
Vector Machines, a technique which has found application in several fields, including
medicine (Bazzani et al 2001).
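The kernel formulation can be sketched as follows. This is a minimal illustration under stated assumptions: here the input vectors are stored as rows (the transpose of the convention used for $\mathbf{X}$ above), the data are a hypothetical noisy sine, and the values of $\lambda$ and $\sigma$ are arbitrary rather than those tuned in the study.

```python
import numpy as np

def gaussian_kernel(U, V, sigma):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2)) for the rows of U and V."""
    sq = (np.sum(U**2, axis=1)[:, None] + np.sum(V**2, axis=1)[None, :]
          - 2.0 * U @ V.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def polynomial_kernel(U, V, p):
    """K(x, y) = (1 + x . y)^p for the rows of U and V."""
    return (1.0 + U @ V.T) ** p

def kernel_rls(K, y, lam):
    """Return the coefficients c of equation (9) and the LOO error of (16)."""
    l = K.shape[0]
    M = np.linalg.inv(K + lam * np.eye(l))
    c = M @ y
    G = K @ M                                  # G = K (K + lam*I)^{-1}
    loo = np.mean(((y - K @ c) / (1.0 - np.diag(G))) ** 2)
    return c, loo

# Hypothetical one-dimensional regression problem.
rng = np.random.default_rng(2)
inputs = rng.uniform(-3.0, 3.0, size=(40, 1))
y = np.sin(inputs[:, 0]) + 0.1 * rng.normal(size=40)
lam = 0.01

K = gaussian_kernel(inputs, inputs, sigma=1.0)
c, loo = kernel_rls(K, y, lam)

# Prediction on a new point, equation (15): sum_i c_i K(x_i, x).
x_new = np.array([[0.5]])
prediction = gaussian_kernel(x_new, inputs, sigma=1.0) @ c
```

Swapping `gaussian_kernel` for `polynomial_kernel` changes the hypothesis space without touching `kernel_rls`, which is the practical meaning of controlling the degree of nonlinearity through the kernel.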
3. Results

3.1. Physiological data 

Our data are from 47 healthy volunteers (age: 53 ± 8 years, M/F: 40/7) and 275 patients
with chronic heart failure (CHF) (age: 52 ± 9 years, LVEF: 28 ± 8%, NYHA class:
2.1 ± 0.7, M/F: 234/41), caused mainly by ischemic or idiopathic dilated cardiomyopathy
(48% and 44% respectively), consecutively referred to the Heart Failure Unit of the
Scientific Institute of Montescano, S. Maugeri Foundation (Italy) for evaluation and
treatment of advanced heart failure. In the patient group, cardiac death
occurred in 54 (20%) of the patients during a 3-year follow-up, while the other 221
patients were still alive at the end of the follow-up period. All the subjects underwent a
10 min supine resting recording in paced respiration regime (Cooke et al 1998, Rzeczinski
et al 2002). To perform paced breathing, subjects were asked to follow a digitally
recorded human voice inducing inspiratory and expiratory phases, at 0.25 Hz frequency.
Non-invasive recording of arterial blood pressure at the finger (Finapres device) was
performed. For each cardiac cycle, the corresponding value of SAP was computed, and the
resulting series was re-sampled at a frequency of 2 Hz using cubic spline interpolation.
As an example, in Fig. 2 we report the SAP time series for one of the subjects.

Figure 2. The time series of the systolic arterial pressure for one of the subjects
examined.

Let us denote by $\{x_i\}_{i=1,\ldots,N}$ the time series of SAP values, which we assume to
be stationary (this assumption is justified by the short length of the recording). The
models previously introduced are used to make predictions on the time series. We
fix the length of a window $m$, and for $k = 1$ to $\ell$ (where $\ell = N - m$), we
denote $\mathbf{x}_k = (x_{k+m-1}, x_{k+m-2}, \ldots, x_k)$ and $y_k = x_{k+m}$; we treat
these quantities as $\ell$ realizations of the stochastic variables $\mathbf{x}$ (input
variables) and $y$ (output variable). In the preprocessing stage, the time series are
normalized to have zero mean and unit variance, but are not filtered. We use $m = 30$, so
that the input pattern receives contributions from frequencies greater than 0.066 Hz (a
window of 30 samples at 2 Hz spans 15 s), thus including part of the LF (low frequency,
0.04-0.15 Hz) band and the HF (high frequency, 0.15-0.45 Hz) band, the major rhythms of
heart rate and blood pressure variability. All the formalism previously described is
applied to model the dependency of $y$ on $\mathbf{x}$, i.e. to forecast the time series on
the basis of the $m$ previous values: the LOO error is a robust measure of its
predictability. We use the Gaussian kernel and polynomial kernels of degree 1, 2 and 3.
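The construction of the input patterns and targets can be sketched as below. The function name `delay_embed` and the synthetic series (a 0.25 Hz oscillation sampled at 2 Hz for 10 min) are assumptions for illustration; they merely stand in for a real SAP recording.

```python
import numpy as np

def delay_embed(series, m):
    """Build the patterns x_k = (x_{k+m-1}, ..., x_k) and targets y_k = x_{k+m}.

    The series is first normalized to zero mean and unit variance.
    Rows of `inputs` are the patterns; there are l = N - m of them.
    """
    x = np.asarray(series, dtype=float)
    x = (x - x.mean()) / x.std()
    l = x.size - m
    inputs = np.array([x[k:k + m][::-1] for k in range(l)])  # most recent value first
    targets = x[m:]
    return inputs, targets

# Hypothetical stand-in for a recording: 10 min at 2 Hz, i.e. N = 1200 samples.
fs = 2.0
m = 30
t = np.arange(1200) / fs
series = 120.0 + 5.0 * np.sin(2.0 * np.pi * 0.25 * t)

inputs, targets = delay_embed(series, m)

# The window spans m / fs seconds, so the slowest rhythm completing a full
# cycle inside it has frequency fs / m.
window_seconds = m / fs
lowest_frequency = 1.0 / window_seconds
```

With these settings the window lasts 15 s, which corresponds to the lower frequency limit of about 0.066 Hz quoted in the text.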

To show the role of the parameter $\lambda$, in fig. 3 we depict, for a typical control
subject, both the LOO error and the empirical error versus $\lambda$. As $\lambda$
increases, the empirical risk monotonically increases, whilst the LOO error shows a minimum
at a finite value of $\lambda$, ensuring the best generalization capability. We fix the
value of $\lambda$ once for all subjects, by minimizing the average LOO error on a subset
made of an equal number of control and CHF time series. This procedure yields
$\lambda = 0.01$ for the Gaussian kernel and the polynomial kernels of degree 1 and 2,
whilst for the third order polynomial kernel the optimal value we find is
$\lambda = 0.1$.‖

Figure 3. For a typical control subject, the LOO error (continuous line) and
the empirical error (dashed line) are represented versus $\lambda$. A Gaussian kernel,
with $\sigma = 8.5$, is used.
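The tuning procedure can be sketched as a grid search over $\lambda$, minimizing the LOO error averaged over a set of series. Everything specific in the sketch is an assumption for illustration: the toy series are noisy sines rather than SAP recordings, the inputs are scalar, and the grid and kernel width are arbitrary.

```python
import numpy as np

def gaussian_gram(x, sigma):
    """Gram matrix of a Gaussian kernel for scalar inputs x (toy setting)."""
    d = x[:, None] - x[None, :]
    return np.exp(-d**2 / (2.0 * sigma**2))

def errors_for_lambda(K, y, lam):
    """Return (empirical error, LOO error) of kernel RLS for one series."""
    l = K.shape[0]
    M = np.linalg.inv(K + lam * np.eye(l))
    residuals = y - K @ (M @ y)               # training residuals y_i - f(x_i)
    G = K @ M                                 # G = K (K + lam*I)^{-1}
    loo_residuals = residuals / (1.0 - np.diag(G))
    return np.mean(residuals**2), np.mean(loo_residuals**2)

def select_lambda(grams, targets, grid):
    """Pick the grid value minimizing the LOO error averaged over all series."""
    mean_loo = np.array([
        np.mean([errors_for_lambda(K, y, lam)[1] for K, y in zip(grams, targets)])
        for lam in grid
    ])
    return grid[int(np.argmin(mean_loo))], mean_loo

# Hypothetical tuning subset: four noisy sine series (not the SAP data).
rng = np.random.default_rng(3)
x = np.linspace(0.0, 6.0, 60)
grams = [gaussian_gram(x, sigma=1.0) for _ in range(4)]
targets = [np.sin(x) + 0.2 * rng.normal(size=x.size) for _ in range(4)]
grid = np.logspace(-4, 2, 13)

best_lam, mean_loo = select_lambda(grams, targets, grid)

# Empirical error of the first series along the grid: it grows with lam.
emp = np.array([errors_for_lambda(grams[0], targets[0], lam)[0] for lam in grid])
```

Fixing a single value for all series, rather than tuning per subject, keeps the LOO errors of different subjects comparable, which is the design choice described in the text.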

We thus evaluate the LOO error for all the 322 subjects (Table 1). For every kernel,
healthy subjects are characterized by a smaller LOO error than patients. Moreover, CHF
patients who died show a greater LOO error than surviving patients. Hence the seriousness
of the heart disease appears to be correlated with the LOO error. The regularized linear
model seems to be the best model of SAP time series. We verify that the LOO errors
from controls and patients are normally distributed and check the homogeneity of the
variances of the two groups; we apply the t-test to evaluate the probability that the LOO
error values, relative to controls and patients, were drawn from the same distribution
(the null hypothesis) (Table 2). For all kernels, the null hypothesis can be rejected, also
after the Bonferroni correction (which lowers the threshold to 0.05/4 = 0.0125). The
Gaussian kernel shows the best separation between the two classes. We have also tested
the separation between dead and alive patients, and the results are also displayed in
Table 2. Only when the Gaussian kernel is used is the p-value lower than 0.0125: since
all orders of nonlinearity contribute to the Gaussian modelling, this result suggests
that the phenomenon outlined here is an effect with strong nonlinear contributions.

‖ For the Gaussian kernel, σ was also tuned to minimize the LOO error, and fixed equal to 8.5.



Table 1. Mean values of LOO error.

Kernel      Controls    CHF       CHF alive    CHF dead
Gaussian    0.0386      0.0806    0.0767       0.0968
1-poly      0.0019      0.0158    0.0131       0.0272
2-poly      0.0022      0.0842    0.0745       0.1242
3-poly      0.0082      0.1493    0.1484       0.1526
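As a small consistency check on Table 1, the mean over all CHF patients should be the average of the two subgroup means weighted by their sizes (221 alive, 54 dead). The sketch below uses only the values reported in the table and in the data description; the tolerance is an assumption that accounts for the four-decimal rounding of the entries.

```python
# Values copied from Table 1: (CHF, CHF alive, CHF dead) for each kernel.
table1 = {
    "Gaussian": (0.0806, 0.0767, 0.0968),
    "1-poly": (0.0158, 0.0131, 0.0272),
    "2-poly": (0.0842, 0.0745, 0.1242),
    "3-poly": (0.1493, 0.1484, 0.1526),
}
n_alive, n_dead = 221, 54

# Weighted mean of the subgroup means, to be compared with the CHF column.
weighted = {
    kernel: (n_alive * alive + n_dead * dead) / (n_alive + n_dead)
    for kernel, (_, alive, dead) in table1.items()
}
discrepancy = {kernel: abs(weighted[kernel] - table1[kernel][0]) for kernel in table1}
```

All four discrepancies stay within what rounding of the table entries can explain.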


Table 2. P-values.

Kernel      Controls vs CHF    CHF alive vs CHF dead
Gaussian    1.03E-08           0.0088
1-poly      0.0011             0.1825
2-poly      0.0010             0.1289
3-poly      0.0121             0.1429
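The Bonferroni comparison described in the text can be made explicit with the values of Table 2. The sketch below only applies the corrected threshold 0.05/4 to the reported p-values; the variable names are chosen for this example.

```python
# P-values copied from Table 2.
p_controls_vs_chf = {"Gaussian": 1.03e-08, "1-poly": 0.0011, "2-poly": 0.0010, "3-poly": 0.0121}
p_alive_vs_dead = {"Gaussian": 0.0088, "1-poly": 0.1825, "2-poly": 0.1289, "3-poly": 0.1429}

# Bonferroni correction for the four kernels tested.
alpha, n_tests = 0.05, 4
threshold = alpha / n_tests

significant_controls = [k for k, p in p_controls_vs_chf.items() if p < threshold]
significant_survival = [k for k, p in p_alive_vs_dead.items() if p < threshold]
```

Under this threshold all four kernels separate controls from CHF patients, while only the Gaussian kernel separates surviving patients from those who died, in agreement with the statements in the text.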





4. Discussion 

We show that the LOO prediction error of physiological time series can serve
as a measure of the organization of the underlying regulation mechanisms, and can thus be
used to detect changes of physiological state and pathological conditions. We propose
the use of RLS models for time series prediction because they allow fast calculation of the
LOO error and their degree of nonlinearity can be easily controlled. We consider here
the SAP time series in healthy subjects undergoing paced breathing, and in patients with
heart disease. We find that the entrainment mechanism connected to paced breathing, which
renders the arterial blood pressure signal more deterministic and thus more predictable, is
less effective in patients, and this effect correlates with the seriousness of the heart
failure; paced breathing conditions seem suitable for diagnostics of a human state. In
our opinion, the LOO error, as a measure of determinism and complexity, is a concept
that has potential application to a wide variety of physiological and clinical time-series
data.